The American Journal of Human Genetics

77 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
A systematic assessment of the impact of rare canonical splice site variants on splicing using functional and in silico methods
2023-07-06 genetic and genomic medicine 10.1101/2023.06.29.23292012
#1 (30.3%)

Background/Objectives: Canonical splice site variants (CSSVs) are often presumed to cause loss-of-function (LoF) and are assigned very strong evidence of pathogenicity (according to ACMG criterion PVS1). However, the exact nature and predictability of splicing effects of unselected rare CSSVs in blood-expressed genes are poorly understood. Methods: A total of 184 rare CSSVs in unselected blood-expressed genes were identified by genome sequencing in 121 individuals, and their impact on splicing was i...

2
When Two plus Four Does Not Equal Six: Combining Computational and Functional Evidence Towards Classification of BRCA1 Key Domain Missense Substitutions.
2024-10-10 genetic and genomic medicine 10.1101/2024.10.09.24315186
#1 (30.2%)

Classification of genetic variants remains an obstacle to realizing the full potential of clinical genetic sequencing. Because of their ability to interrogate large numbers of variants, multiplexed assays of variant effect (MAVEs) and computational tools are viewed as a critical part of the solution to variant classification uncertainty. However, the (joint) performance of these assays and tools on novel variants has not been established. Transformation of the qualitative classification guidelin...

3
Clinical application of Complete Long Read genome sequencing identifies a 16kb intragenic duplication in EHMT1 in a patient with suspected Kleefstra syndrome
2024-03-29 genetic and genomic medicine 10.1101/2024.03.28.24304304
#1 (30.2%)

Long read sequencing offers benefits for the detection of structural variation in Mendelian disease. Here, we applied a new technology that generates contiguous long reads via tagmentation and sequencing by synthesis to a small cohort of patients with undiagnosed disease from the Undiagnosed Diseases Network. We first compare sequencing from the HG002 benchmark sample from Genome In A Bottle using nanopore sequencing (R10.4.1, duplex reads, Oxford Nanopore), single molecule real time sequencing ...

4
Landscape of parental postzygotic mutations in >11,000 rare disease trios
2025-10-19 genetic and genomic medicine 10.1101/2025.10.17.25337713
#1 (25.1%)

Postzygotic mutations (PZMs) arising post-fertilisation, prior to primordial germ cell specification, may be subsequently inherited by both somatic and germ cells, causing somatic mosaicism in the parent as well as being passed to offspring. These early embryonic mutations often go undetected in clinical sequencing due to their low variant allele fraction (VAF) and are typically excluded by standard variant filtering pipelines. To overcome this limitation, we developed a bioinformatic approach t...

5
Implications of Genetic Distance to Reference and De Novo Genome Assembly for Clinical Genomics in Africans
2020-09-27 genetic and genomic medicine 10.1101/2020.09.25.20201780
#1 (24.8%)

In clinical genomics, variant calling from short-read sequencing data typically relies on a pan-genomic, universal human reference sequence. A major limitation of this approach is that the number of reads that incorrectly map or fail to map increases as the reads diverge from the reference sequence. In the context of genome sequencing of genetically diverse Africans, we investigate the advantages and disadvantages of using a de novo assembly of the read data as the reference sequence in single sa...

6
De novo variants in KDM2A cause a syndromic neurodevelopmental disorder
2025-04-02 genetic and genomic medicine 10.1101/2025.03.31.25324695
#1 (24.5%)

Germline variants that disrupt components of the epigenetic machinery cause syndromic neurodevelopmental disorders. Using exome and genome sequencing, we identified de novo variants in KDM2A, a lysine demethylase crucial for embryonic development, in 18 individuals with developmental delays and/or intellectual disabilities. The severity ranged from learning disabilities to severe intellectual disability. Other core symptoms included feeding difficulties, growth issues such as intrauterine growth...

7
Regional nonsense constraint offers clinical and biological insights into rare genetic disorders
2024-10-13 genetic and genomic medicine 10.1101/2024.10.10.24315185
#1 (24.5%)

Understanding the molecular consequences of variants which introduce premature termination codons (PTCs) is essential to predicting their clinical impact [1]. Although transcripts with PTCs are generally expected to undergo nonsense-mediated mRNA decay (NMD) [2-4], up to 45% of all possible PTCs in the human genome are predicted to escape NMD [5]. Existing studies of constraint against predicted loss-of-function variants at the transcript level [6-10] do not account for regional differences in NMD efficienc...

8
Massively parallel identification of functionally consequential noncoding genetic variants in undiagnosed rare disease patients
2021-11-04 genetic and genomic medicine 10.1101/2021.11.02.21265771
#1 (24.3%)

Clinical whole genome sequencing has enabled the discovery of potentially pathogenic noncoding variants in the genomes of rare disease patients with a prior history of negative genetic testing. However, interpreting the functional consequences of noncoding variants and distinguishing those that contribute to disease etiology remains a challenge. Here we address this challenge by experimentally profiling the functional consequences of rare noncoding variants detected in a cohort of undiagnosed ra...

9
Proteomic and clinical impact of human knockouts in British South Asians
2025-10-15 genetic and genomic medicine 10.1101/2025.10.15.25337977
#1 (24.2%)

Human loss-of-function (LoF) variants affecting both copies of a gene ("human knockouts") provide a unique opportunity to directly study the function and clinical impact of genes but are very rare in most populations sequenced to date. Here we study 1,569 British Bangladeshi and Pakistani adults who were recalled for plasma sampling for proteomic profiling using three distinct technologies (covering >12,000 proteins) from 55k whole exome sequenced Genes & Health adults - a cohort enriched for rare,...

10
Benchmarking autosomal recessive disease prevalence estimation from allele frequencies against newborn screening data
2025-10-13 genetic and genomic medicine 10.1101/2025.10.11.25337773
#1 (24.2%)

Accurate estimates for the prevalence of rare congenital diseases are critical for understanding disease epidemiology and enabling drug development. Prevalence estimates can inform public health investment, identify communities with high disease burden or underdiagnosis, and reveal areas of unmet clinical need. With the advent of global-scale biobanks, genetics-based models to estimate the prevalence of disease have become viable. Autosomal recessive (AR) rare diseases are particularly tractabl...

11
The impact of rare pathogenic CNVs is exacerbated by assortative mating.
2025-09-12 genetic and genomic medicine 10.1101/2025.09.08.25335316
#1 (24.1%)

Copy-number variants (CNVs) are linked to a spectrum of outcomes, and carriers of the same variant exhibit variable disease severity. We explored the impact of an individual's polygenic score (PGS) on explaining these differences, focusing on 119 established CNV-trait associations involving 43 clinically relevant phenotypes. We called CNVs among white British UK Biobank participants, then divided samples into a training set (n = 264,372) to derive independent PGS weights, and a CNV-carrier-enriche...

12
Combining MAVEs and computational predictors improves variant classification across ancestries in hereditary cancer genes
2025-12-09 genetic and genomic medicine 10.64898/2025.12.08.25341119
#1 (24.0%)

Many commonly used computational tools for variant effect prediction exhibit ancestry-related bias because they are trained on clinical or population datasets that under-represent global diversity, leading to uneven and sometimes unfair variant classification across ancestries. Multiplexed assays of variant effect (MAVEs) and population-free VEPs instead offer alternatives that are unbiased with respect to human ancestry, providing classification evidence that generalises across populations. Her...

13
Systematic identification of disease-causing promoter and untranslated region variants in 8,040 undiagnosed individuals with rare disease
2023-09-12 genetic and genomic medicine 10.1101/2023.09.12.23295416
#1 (23.9%)

Background: Both promoters and untranslated regions (UTRs) have critical regulatory roles, yet variants in these regions are largely excluded from clinical genetic testing due to difficulty in interpreting pathogenicity. The extent to which these regions may harbour diagnoses for individuals with rare disease is currently unknown. Methods: We present a framework for the identification and annotation of potentially deleterious proximal promoter and UTR variants in known dominant disease genes. We us...

14
Long-read transcriptome analysis using IsoRanker for identifying pathogenic variants in Mendelian conditions
2025-11-13 genetic and genomic medicine 10.1101/2025.11.07.25339764
#1 (23.9%)

Identifying pathogenic non-coding variants that contribute to Mendelian conditions remains challenging as the functional impact of these variants on gene function is often unknown. We present IsoRanker, a long-read transcriptome sequencing-based framework that prioritizes functionally relevant non-coding variants by detecting genes and novel isoforms with outlier expression, allelic imbalance, and/or nonsense-mediated decay (NMD). We generated paired cycloheximide-treated and untreated fibroblas...

15
From Uncertain to Actionable: Significant Reduction in Variants of Uncertain Significance in Hereditary Germline Testing via Multi-Institutional Real-World Evidence
2025-08-15 genetic and genomic medicine 10.1101/2025.08.12.25333547
#1 (23.8%)

The clinical utility of genomic testing is constrained by variants of uncertain significance (VUS), which complicate diagnostic interpretation and patient management. The ACMG/AMP PS4 criterion, "prevalence in affected individuals statistically increased compared to controls," offers strong evidence for pathogenicity but is often challenging to apply due to the limited availability of robust, matched case-control genomic and phenotypic data. Further, there are currently no options available to s...

16
Potential Misrepresentation of Inherited Breast Cancer Risk by Common Germline Alleles
2022-10-22 genetic and genomic medicine 10.1101/2022.10.21.22281361
#1 (23.7%)

Hundreds of common variants have been found to confer small but significant differences in breast cancer risk, supporting the polygenic additive model of inherited risk. This widely accepted model is at odds with twin data indicating highly elevated risk in a subgroup of women. Using a novel closed-pattern-mining algorithm, we provide evidence that rare variants or haplotypes may underlie the association of breast cancer risk with common germline alleles. Our method, called Chromosome Overlap, c...

17
Pharmacogenomic variant profiling in 14,490 Koreans using a population-specific genotyping array
2026-01-26 genetic and genomic medicine 10.64898/2026.01.19.26344411
#1 (23.7%)

Pharmacogenomics is an essential component of precision medicine; however, most existing knowledge has been derived from populations of European ancestry, limiting the understanding of pharmacogenomic diversity in East Asian populations. In this study, we applied genotype imputation to the Korea Biobank Array v2.0 using a reference panel of 8,062 Korean whole-genome sequencing (WGS) samples and analyzed pharmacogenomic variants and phenotypes in 14,490 Korean individuals. To assess the accuracy ...

18
Ehmt2 Loss-Of-Function Alterations Cause A Kleefstra-Like Syndrome
2024-01-11 genetic and genomic medicine 10.1101/2024.01.10.24300997
#1 (23.7%)

Dysregulation of the epigenetic machinery is associated with neurodevelopmental defects in humans. Kleefstra syndrome (KS) is a neurodevelopmental syndrome caused by heterozygous alterations in the gene EHMT1 that cause loss-of-function. EHMT1 and EHMT2 are highly similar histone methyltransferases that play relevant roles in development. Despite their similarity, individuals with alterations in EHMT2 have never been described. Here, we describe a pediatric patient with a KS-overlapping phenotyp...

19
A systematic analysis of splicing variants identifies new diagnoses in the 100,000 Genomes Project.
2022-01-31 genetic and genomic medicine 10.1101/2022.01.28.22270002
#1 (23.6%)

Genomic variants which disrupt splicing are a major cause of rare genetic disease. However, variants which lie outside of the canonical splice sites are difficult to interpret clinically. Here, we examine the landscape of splicing variants in whole-genome sequencing data from 38,688 individuals in the 100,000 Genomes Project, and assess the contribution of non-canonical splicing variants to rare genetic diseases. We show that splicing branchpoints are highly constrained by purifying selection, a...

20
The Utility of Ultra-Deep RNA sequencing in Mendelian Disorder Diagnostics
2025-01-29 genetic and genomic medicine 10.1101/2025.01.28.25321295
#1 (23.5%)

Clinical RNA-seq has become an essential tool for resolving variants of uncertain significance (VUS), particularly those affecting gene expression and splicing. However, most reference data and diagnostic protocols employ relatively modest sequencing depths (~50-150 million reads), which may fail to capture low-abundance transcripts and rare splicing events critical for accurate diagnoses. We evaluated the diagnostic and translational utility of ultra-high-depth (up to ~one billion unique re...